update vector search docs #18779
base: master
> **Warning:** | ||
> | ||
> The vector search feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. |
> The vector search feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. | |
> The vector search feature is experimental and some behaviors may change in future versions. It is not recommended that you use it in the production environment. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. |
Remove "This feature might be changed or removed without prior notice." as Vector Search is experimental because of stability issues, not product decision issues. Vector Search will never be removed.
|
||
> **Warning:** | ||
> | ||
> The vector search feature is in beta. It might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. |
> The vector search feature is in beta. It might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. | |
> The vector search feature is in beta and some behaviors may change in future versions. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. |
CREATE TABLE foo ( | ||
id INT PRIMARY KEY, | ||
data VECTOR(5), | ||
data64 VECTOR64(10), |
data64 VECTOR64(10), |
We do not support this syntax.
Signed-off-by: JaySon-Huang <[email protected]>
## From v7.x to v8.4 or a later version | ||
|
||
Starting from v8.4, the underlying storage format of TiFlash has been updated to support [vector search](/vector-search-overview.md). Therefore, after you upgrade TiFlash to v8.4 or a later version, in-place downgrading to the original version is not supported. | ||
|
FYI, I've pushed a commit about the tiflash upgrade notice
TiDB currently supports the following vector search index algorithm: | ||
|
||
- HNSW |
TiDB currently supports the following vector search index algorithm: | |
- HNSW | |
TiDB currently supports the [HNSW (Hierarchical Navigable Small World)](https://en.wikipedia.org/wiki/Hierarchical_navigable_small_world) vector search index algorithm. |
- TiFlash nodes must be deployed in your cluster in advance. | ||
- Vector search indexes cannot be used as primary keys or unique indexes. | ||
- Vector search indexes can only be created on a single vector column and cannot be combined with other columns (such as integers or strings) to form composite indexes. | ||
- A distance function must be specified when creating and using vector search indexes (currently, only cosine distance `VEC_COSINE_DISTANCE()` and L2 distance `VEC_L2_DISTANCE()` functions are supported). |
- A distance function must be specified when creating and using vector search indexes (currently, only cosine distance `VEC_COSINE_DISTANCE()` and L2 distance `VEC_L2_DISTANCE()` functions are supported). | |
- A distance function must be specified when creating and using vector search indexes. Currently, only cosine distance `VEC_COSINE_DISTANCE()` and L2 distance `VEC_L2_DISTANCE()` functions are supported. |
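For reference, the two supported distance functions correspond to standard metrics. The following is a minimal Python sketch of what they compute, assuming `VEC_COSINE_DISTANCE()` returns one minus the cosine similarity and `VEC_L2_DISTANCE()` returns the Euclidean distance. The helper names `cosine_distance` and `l2_distance` are illustrative and are not part of TiDB, and raising an error for a zero vector is a choice made only for this sketch.

```python
import math


def l2_distance(p, q):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cosine_distance(p, q):
    """Cosine distance, defined here as 1 - cosine similarity."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        # Assumption for this sketch only: reject zero vectors explicitly.
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_p * norm_q)
```

Because the two metrics rank neighbors differently in general, an index built with one distance function should not be expected to serve queries that order by the other.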
> | ||
> The vector search feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub. | ||
|
||
</CustomContent> | ||
|
<CustomContent platform="tidb-cloud"> | |
|
||
> **Note:** | ||
> | ||
> Vector search index is only available for [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless) clusters. | ||
> The vector search feature is only available for TiDB Self-Managed clusters and [TiDB Cloud Serverless](/tidb-cloud/select-cluster-tier.md#tidb-cloud-serverless) clusters. | ||
|
</CustomContent> | |
- A distance function must be specified when creating and using vector search indexes (currently, only cosine distance `VEC_COSINE_DISTANCE()` and L2 distance `VEC_L2_DISTANCE()` functions are supported). | ||
- For the same column, creating multiple vector search indexes using the same distance function is not supported. | ||
- Deleting columns with vector search indexes is not supported. Creating multiple indexes in the same statement is not supported. | ||
- Setting vector search indexes as [invisible](/sql-statements/sql-statement-alter-index.md) is not supported. |
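Added this item:
- Modifying the type of a column with a vector index is not supported (lossy change, that is, a change that modifies the column data).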
- Setting vector search indexes as [invisible](/sql-statements/sql-statement-alter-index.md) is not supported. | |
- Modifying the type of a column with a vector index is not supported (lossy change, that is, column data is modified). | |
- Setting vector search indexes as [invisible](/sql-statements/sql-statement-alter-index.md) is not supported. |
|
||
ALTER TABLE foo ADD VECTOR INDEX idx_name ((VEC_COSINE_DISTANCE(data))) USING HNSW; | ||
``` | ||
|
I suggest removing this note. When the feature becomes GA, it is easy to forget to update it here. There is another occurrence at L90.
@@ -95,15 +156,15 @@ SELECT * FROM | |||
) t | |||
WHERE category = "document"; | |||
|
|||
-- Note that this query may return less than 5 results if some are filtered out. | |||
-- Note that this query might return less than 5 results if some are filtered out. |
-- Note that this query might return less than 5 results if some are filtered out. | |
-- Note that this query might return fewer than 5 results if some are filtered out. |
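The caveat in that SQL comment follows from the query shape: the `LIMIT` is applied in the inner nearest-neighbor query, and the `WHERE category = "document"` filter runs afterwards. A minimal Python sketch of that post-filter behavior is shown below. The rows, the use of L2 distance, and the `top_k_then_filter` helper are all made up for illustration and are not TiDB code.

```python
import math


def l2(p, q):
    """Euclidean distance, used here as a stand-in distance function."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def top_k_then_filter(rows, query, k, category):
    """Take the k nearest rows first, then filter by category (post-filter)."""
    nearest = sorted(rows, key=lambda r: l2(r["vec"], query))[:k]
    return [r for r in nearest if r["category"] == category]


# Hypothetical data: four "document" rows exist, but only two are in the top 5.
rows = [
    {"id": 1, "category": "document", "vec": [0.0, 0.1]},
    {"id": 2, "category": "image", "vec": [0.1, 0.0]},
    {"id": 3, "category": "image", "vec": [0.2, 0.2]},
    {"id": 4, "category": "document", "vec": [0.3, 0.3]},
    {"id": 5, "category": "image", "vec": [0.4, 0.4]},
    {"id": 6, "category": "document", "vec": [0.9, 0.9]},
    {"id": 7, "category": "document", "vec": [1.0, 1.0]},
]

result = top_k_then_filter(rows, query=[0.0, 0.0], k=5, category="document")
print([r["id"] for r in result])
```

In this sketch only rows 1 and 4 survive the filter, even though the table holds four matching rows, which is why "fewer than 5 results" is the accurate wording.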
@@ -163,9 +251,11 @@ SELECT * FROM INFORMATION_SCHEMA.TIFLASH_INDEXES; | |||
|
|||
For more information, see [`ALTER TABLE ... COMPACT`](/sql-statements/sql-statement-alter-table-compact.md). | |||
|
|||
In addition, you can monitor the execution progress of the DDL job by executing `ADMIN SHOW DDL JOBS;` and checking the `row count`. However, this method is not fully accurate, because the `row count` value is obtained from the `rows_stable_indexed` field in `TIFLASH_INDEXES`. This approach can used as a reference for tracking the progress of indexing. |
In addition, you can monitor the execution progress of the DDL job by executing `ADMIN SHOW DDL JOBS;` and checking the `row count`. However, this method is not fully accurate, because the `row count` value is obtained from the `rows_stable_indexed` field in `TIFLASH_INDEXES`. This approach can used as a reference for tracking the progress of indexing. | |
In addition, you can monitor the execution progress of the DDL job by executing `ADMIN SHOW DDL JOBS;` and checking the `row count`. However, this method is not fully accurate, because the `row count` value is obtained from the `rows_stable_indexed` field in `TIFLASH_INDEXES`. You can use this approach as a reference for tracking the progress of indexing. |
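As a rough illustration of using such counters as a progress reference, the sketch below turns two row counts into an indexed fraction. It assumes that you have already read `rows_stable_indexed` from `TIFLASH_INDEXES`. The second input (a count of stable rows that are not yet indexed) and the `index_progress` helper are assumptions made for this example and are not taken from the text above.

```python
def index_progress(rows_stable_indexed, rows_stable_not_indexed):
    """Return the fraction (0.0 to 1.0) of counted rows that are indexed.

    Both arguments are row counts. The second one is an assumed input for
    this sketch; the surrounding text only names rows_stable_indexed.
    """
    if rows_stable_indexed < 0 or rows_stable_not_indexed < 0:
        raise ValueError("row counts must be non-negative")
    total = rows_stable_indexed + rows_stable_not_indexed
    if total == 0:
        # No rows counted yet, so no progress can be reported.
        return 0.0
    return rows_stable_indexed / total


print(f"{index_progress(750, 250):.0%}")
```

The result reflects only the counters passed in, so treat it as an estimate, in line with the accuracy caveat above.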
|
||
- `<HOST>`: The host of the TiDB cluster. | ||
- `<PORT>`: The port of the TiDB cluster. | ||
- `<USER>`: The username to connect to the TiDB cluster. |
The Chinese version uses `<USERNAME>`; the two versions need to be consistent.
|
||
- `<HOST>`: The host of the TiDB cluster. | ||
- `<PORT>`: The port of the TiDB cluster. | ||
- `<USER>`: The username to connect to the TiDB cluster. |
The Chinese version uses `<USERNAME>`.
|
||
- `<HOST>`: The host of the TiDB cluster. | ||
- `<PORT>`: The port of the TiDB cluster. | ||
- `<USER>`: The username to connect to the TiDB cluster. |
The Chinese version uses `<USERNAME>`.
First-time contributors' checklist
What is changed, added or deleted? (Required)
This PR moves 15 vector search docs from the tidb-cloud folder to the vector-search folder so that they can be reused by the TiDB Self-Managed docs.
Which TiDB version(s) do your changes apply to? (Required)
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?